Canonicalization of Feature Parameters for Robust Speech Recognition Based on Distinctive Phonetic Feature (DPF) Vectors

Authors

  • Mohammad Nurul Huda
  • Muhammad Ghulam
  • Takashi Fukuda
  • Kouichi Katsurada
  • Tsuneo Nitta
Abstract

This paper describes a robust automatic speech recognition (ASR) system that requires less computation. The acoustic models of a hidden Markov model (HMM)-based classifier include various types of hidden factors, such as speaker-specific characteristics, coarticulation, and the acoustic environment. If a canonicalization process can recover the margin of acoustic likelihoods between correct phonemes and other phonemes that is degraded by hidden factors, the robustness of ASR systems can be improved. In this paper, we introduce a canonicalization method composed of multiple distinctive phonetic feature (DPF) extractors, each corresponding to the canonicalization of one hidden factor, and a DPF selector that selects an optimum DPF vector as the input to the HMM-based classifier. The proposed method resolves gender factors and speaker variability, and eliminates noise factors by applying canonicalization based on the DPF extractors and two-stage Wiener filtering. In experiments on AURORA2J, the proposed method provides higher word accuracy under clean training and a significant improvement in word accuracy at low signal-to-noise ratio (SNR) under multi-condition training, compared to a standard ASR system with mel-frequency cepstral coefficient (MFCC) parameters. Moreover, the proposed method requires fewer Gaussian mixture components (two-fifths as many) and less memory to achieve accurate ASR.

Key words: automatic speech recognition, feature extraction, canonicalization, distinctive phonetic feature, hidden factor
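The extractor-and-selector arrangement described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the selector decides which DPF vector is optimum, so the entropy-based confidence score below is purely an assumption, and the toy extractors, their gains, and the names `factor_a`, `factor_b`, `confidence`, and `select_dpf` are hypothetical stand-ins rather than the authors' implementation.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# A DPF vector holds one score in [0, 1] per distinctive phonetic feature.
DPFVector = List[float]
Extractor = Callable[[Sequence[float]], DPFVector]


def binary_entropy(p: float) -> float:
    """Entropy in bits of one DPF score treated as a Bernoulli probability."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp to avoid log(0)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def confidence(dpf: DPFVector) -> float:
    """ASSUMED selection criterion: higher when scores sit close to 0 or 1."""
    return -sum(binary_entropy(p) for p in dpf) / len(dpf)


def select_dpf(frame: Sequence[float],
               extractors: Dict[str, Extractor]) -> Tuple[str, DPFVector]:
    """Run every extractor on the frame and keep the most confident output."""
    candidates = {name: extract(frame) for name, extract in extractors.items()}
    best = max(candidates, key=lambda name: confidence(candidates[name]))
    return best, candidates[best]


def make_toy_extractor(gain: float) -> Extractor:
    """HYPOTHETICAL stand-in for a trained DPF extractor (a scaled sigmoid)."""
    def extract(frame: Sequence[float]) -> DPFVector:
        return [1.0 / (1.0 + math.exp(-gain * x)) for x in frame]
    return extract


# One extractor per hidden factor; the labels are placeholders.
extractors = {
    "factor_a": make_toy_extractor(0.5),
    "factor_b": make_toy_extractor(4.0),
}
frame = [1.2, -0.8, 2.0]  # made-up acoustic feature values for one frame
name, dpf = select_dpf(frame, extractors)
print(name, [round(v, 3) for v in dpf])
```

In this toy setup the higher-gain extractor produces scores nearer to 0 or 1, so it wins the selection; in a real system the selected vector would then be passed to the HMM-based classifier.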

Similar resources

Canonicalization of feature parameters for automatic speech recognition

Acoustic models (AMs) of an HMM-based classifier include various types of hidden variables such as gender type, speaking rate, and acoustic environment. If there exists a canonicalization process that reduces the influence of the hidden variables from the AMs, a robust automatic speech recognition (ASR) system can be realized. In this paper, we describe the configuration of a canonicalization p...

Designing multiple distinctive phonetic feature extractors for canonicalization by using clustering technique

Acoustic models of an HMM-based classifier include various types of hidden factors such as speaker-specific characteristics and acoustic environments. If there exists a canonicalization process that suppresses the decrease in acoustic-likelihood differences among categories resulting from hidden factors, a robust ASR system can be realized. We have previously proposed the canonicalization proce...

Noise-robust Automatic Speech Re Orthogonalized Distinctive Phoneti

With the aim of using an automatic speech recognition (ASR) system in practical environments, various approaches focused on noise-robustness such as noise adaptation and reduction techniques have been investigated. We have previously proposed a distinctive phonetic feature (DPF) parameter set for a noise-robust ASR system, which reduced the effect of high-level additive noise[1]. This paper des...

Distinctive Phonetic Feature (DPF) Based Phone Segmentation Using 2-stage Multilayer Neural Networks

Segmentation of speech into its corresponding phones has become a very important issue in many speech processing areas, such as speech recognition, speech analysis, speech synthesis, and speech databases. In this paper, for accurate segmentation in speech recognition applications, we introduce Distinctive Phonetic Feature (DPF) based feature extraction using a two-stage MLN (Multi-Layer Neural Netw...

Distinctive phonetic feature (DPF) based phone segmentation using hybrid neural networks

Segmentation of speech into its corresponding phones has become a very important issue in many speech processing areas, such as speech recognition, speech analysis, speech synthesis, and speech databases. In this paper, for accurate segmentation in speech recognition applications, we introduce Distinctive Phonetic Feature (DPF) based feature extraction using a two-stage NN (Neural Networks) system c...

Journal title:
  • IEICE Transactions

Volume 91-D

Publication date: 2008